Abstract

We investigate the Bayesian approach to nonlinear inverse problems by
analysing it from a frequentist's perspective. We show that the posterior
concentrates around the truth if the noise disappears or the amount of
data goes to infinity.

We consider the inverse problem of reconstructing the diffusion coefficient
from noisy observations of the solution to an elliptic PDE in divergence
form. This example is used to illustrate a link between stability results
for deterministic nonlinear inverse problems and posterior consistency for
Bayesian nonparametric regression. We obtain posterior consistency under
weak assumptions on the prior. We obtain an algebraic rate provided there
are appropriate asymptotic lower bounds for the small ball probabilities of
the prior. We establish them for a popular class of priors for the elliptic
inverse problem. For this reason we prove posterior consistency for
Bayesian nonparametric regression under weak assumptions on the prior and
Gaussian observational noise. Our main contribution here is that our
results hold under weak assumptions on the prior. In particular Gaussian
priors satisfy our assumptions. For a particular class of Gaussian prior
and noise studied in the literature our rates are close to the optimal
minimax rate.

An insightful example of posterior inconsistency is provided for the
regression problem with pointwise observations.

1 Introduction 

Mathematical models form the foundation not only of scientific research in
areas such as physics, chemistry, economics, medicine and the life sciences
but also of various processes in technology and industry. These models often
contain unknown parameters or inputs such as initial conditions. The Bayesian
approach to the reconstruction of these parameters is based on a prior
distribution on these unknowns [23, 33]. Together with the distribution of the
noise, the prior determines a unique conditional distribution on the
parameters given the data. This distribution is called the posterior and is
the solution of a Bayesian inverse problem. Because the posterior is only
expressed implicitly as an unnormalised density with respect to the prior, it
has to be probed using sampling and variational methods. This approach can be
seen as an extension of many classical methods for inverse problems. For
example it can be linked to $L^2$-Tikhonov regularisation by approximation of
the maximum a posteriori probability (MAP) estimator. Stability results are
important for these methods since they confirm their reliability [10]. In a
similar way posterior consistency results are needed to justify the Bayesian
approach. There exists a variety of results, tools and ideas for posterior
consistency and inconsistency for statistical problems |161 [HI IZl [55J.
Similar ideas have recently been applied to linear Bayesian inverse problems
[25]. However, most consistency results for linear Bayesian inverse problems
have been obtained using the explicit structure of the posterior illll la-.

In this article we tackle the consistency of nonlinear inverse problems,
motivated by reconstructing the diffusion coefficient from measurements of a
linear second order elliptic PDE. Our method is generally applicable to
inverse problems for which deterministic stability results are available.

We first prove posterior consistency for Bayesian nonparametric regression.
The stability results allow us to transfer the consistency from these problems
to the problem at hand. Moreover, we prove novel asymptotic lower bounds
on the small ball probability for a class of priors that are popular for the
elliptic inverse problem. Using these asymptotics we obtain an efficient rate
for this problem. Whereas our main aim is to weaken the conditions implying
posterior consistency, the rates that we obtain are close to optimal, for
example for the simultaneously diagonalisable linear Gaussian inverse problem.

For completeness we first give a brief overview of Bayesian inverse problems,
precisely define posterior consistency in this setting and place it within
the literature. We then set up an elliptic inverse problem, which is our
motivating example and which will guide us through our main results.

1.1 Review of the Bayesian Approach to Inverse Problems 

Let $a \in X$ be the input of a mathematical model, for example an initial
condition for an ODE, a PDE or a discrete scheme, and let $y = \mathcal{G}(a)$
be the data, where $X$ and $Y$ are Banach spaces. The inverse problem is
concerned with reconstructing $a$ given the data $y$. Because $\mathcal{G}$ can
be non-injective and there is always an error in the data, the problem is not
exactly solvable as stated. It can be tackled by modelling the observational
noise $\xi$ such that the data is generated by

$$y = \mathcal{G}(a) + \xi.$$

The prior $\mu_0(da)$ is a probability measure containing all the a priori
information on $a$. If additionally the forward operator $\mathcal{G}$ is given
and the distribution of $\xi$ and $\mathcal{G}$ satisfy mild assumptions, then
there exists a conditional probability measure on $a$, called the posterior
[33]. It incorporates all the information contained in the data about the
input $a$.

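We assume that the law of the observational noise is Gaussian with
distribution $\xi \sim \mathcal{N}(0,\Gamma)$ and that samples of
$\mathcal{G}(a)$ are $\mu_0$-almost surely elements of the Cameron-Martin space
$(H_\Gamma, \langle\cdot,\cdot\rangle_\Gamma, \|\cdot\|_\Gamma)$ associated to
the noise $\xi$. If $\mathcal{G}$ maps to a finite dimensional space, the
posterior takes the form [33]

$$\frac{d\mu^y}{d\mu_0}(a) \propto \exp\Big(-\tfrac12\|\mathcal{G}(a) - y\|_\Gamma^2\Big) \propto \exp\Big(-\tfrac12\|\mathcal{G}(a)\|_\Gamma^2 + \langle y, \mathcal{G}(a)\rangle_\Gamma\Big).$$

The last expression is also valid for forward operators that have infinite
dimensional range [33]. We summarise the above as

$$\begin{array}{ll}
\text{Prior} & a \sim \mu_0 \\
\text{Noise} & \xi \sim \mathcal{N}(0,\Gamma) \\
\text{Posterior} & \dfrac{d\mu^y}{d\mu_0}(a) \propto \exp\Big(-\tfrac12\|\mathcal{G}(a)\|_\Gamma^2 + \langle y, \mathcal{G}(a)\rangle_\Gamma\Big)
\end{array} \tag{1.1}$$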
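As a minimal numerical sketch of (1.1) in finite dimensions, the following Python snippet evaluates both expressions for the unnormalised log-density and lets one check that they differ only by a constant that does not depend on $a$. The forward map `toy_forward`, the covariance `toy_gamma` and the data `toy_y` are hypothetical choices made purely for illustration; they do not come from the paper.

```python
import numpy as np

def log_posterior_density(a, y, forward, gamma):
    """Unnormalised log-density -1/2 ||G(a)||_Gamma^2 + <y, G(a)>_Gamma from (1.1)."""
    g = forward(a)
    gamma_inv_g = np.linalg.solve(gamma, g)   # Gamma^{-1} G(a)
    return -0.5 * g @ gamma_inv_g + y @ gamma_inv_g

def neg_half_misfit(a, y, forward, gamma):
    """The first expression for the log-density, -1/2 ||G(a) - y||_Gamma^2."""
    r = forward(a) - y
    return -0.5 * r @ np.linalg.solve(gamma, r)

def toy_forward(a):
    """Hypothetical nonlinear forward map from R^2 to R^2 (illustration only)."""
    return np.array([a[0] ** 2 + a[1], np.sin(a[0]) * a[1]])

# Hypothetical symmetric positive definite noise covariance and data vector
toy_gamma = np.array([[2.0, 0.3], [0.3, 1.0]])
toy_y = np.array([0.7, -0.2])
```

The difference between the two functions equals $-\tfrac12\|y\|_\Gamma^2$ for every input $a$, which is why both expressions define the same posterior after normalisation.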



1.2 Posterior Consistency and Identical Twins 

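We consider a fixed known input $a^\dagger$ and generate the data using the
model $\mathcal{G}$. The ability of a method to recover $a^\dagger$ in this
identical twin experiment is called consistency. For the Bayesian approach
this corresponds to the posterior becoming more and more concentrated around
$a^\dagger$. Motivated by the statistical literature, we give a precise
definition of this property below.

Definition 1. (Posterior Consistency)

A sequence of inverse problems $(\mu_0, \mathcal{G}_n, \mathcal{L}(\xi_n))$ is
posterior consistent with rate $\varepsilon_n$ and input $a^\dagger$ with
respect to a metric $d$ if for

$$y_n = \mathcal{G}_n(a^\dagger) + \xi_n$$

there exists a sequence $l_n \to 1$ such that

$$\mathbb{P}_{\xi_n}\big(\mu^{y_n}(B^d_{\varepsilon_n}(a^\dagger)) \ge l_n\big) \to 1. \tag{1.2}$$

1. Posterior consistency in the small noise limit corresponds to

$$\mathcal{L}(\xi_n) = \mathcal{L}(n^{-1/2}\xi) \quad\text{and}\quad \mathcal{G}_n = \mathcal{G}.$$

2. Posterior consistency in the large data limit corresponds to

$$\mathcal{L}(\xi_n) = \bigotimes_{i=1}^n \mathcal{L}(\xi) \quad\text{and}\quad \mathcal{G}_n = \prod_{i=1}^n \mathcal{G}_i.$$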
This definition is analogous to the notion of posterior consistency in [TT].
In that article the authors consider the problem of recovering a probability
distribution from i.i.d. samples:

Let $\mathcal{P}$ be a set of probability measures and $\Pi_n$ be a prior on
$\mathcal{P}$. The data is given as i.i.d. samples $\{X_i\}$ of
$P_0 \in \mathcal{P}$. In this setting they provide results of the form

$$\Pi_n(P : d(P, P_0) > \varepsilon_n \mid X_1, \dots, X_n) \to 0 \quad\text{in } P_0^n\text{-probability}.$$

These results about posterior consistency can be applied to Bayesian inverse
problems [28] by considering

$$\mathcal{P} = \{P_a \mid a \in X\} \quad\text{with}\quad P_a = \mathcal{L}(\mathcal{G}(a) + \xi).$$

Posterior consistency in both the small noise and the large data limit for
infinite dimensional inverse problems has mostly been studied for linear
inverse problems |24l [T] |^, where the prior is either a sieve prior, a
Gaussian or a wavelet expansion with uniformly distributed coefficients. It is
also possible to obtain a rate of posterior consistency by choosing
$\varepsilon_n \to 0$ as $n \to \infty$ if there is an asymptotic lower bound
on small ball probabilities |36l 125].



1.3 Motivation - An Elliptic Inverse Problem 

The determination of the diffusion coefficient $a$ given noisy observations of
the pressure $p$ [301 IHl 122 is a well-known inverse problem. In particular it
becomes relevant in oil reservoir simulations and the reconstruction of
groundwater flow [20]. In this section we consider this problem in an
idealised setting. We assume that the relation between $p$ and $a$ is given by
the following elliptic PDE with Dirichlet boundary conditions

$$-\nabla\cdot(a\nabla p) = f(x) \ \text{in } D, \qquad p = 0 \ \text{on } \partial D \tag{1.3}$$

where $D$ is a bounded smooth domain in $\mathbb{R}^d$. We suppose that $f$ is
smooth, satisfies $f \ge f_{\min} > 0$ and we denote the solution operator to
(1.3) by $p(x; a)$. The inverse problem is concerned with reconstructing $a$
given observations based on $p$:

Assumption 2. The forward operator can be split into a composition of the
solution operator $p$ and an observation operator $\mathcal{O}_n$

$$\mathcal{G}_n(a) = \mathcal{O}_n(p(\cdot; a)). \tag{1.4}$$

In particular we consider

1. $\mathcal{O}_n = \mathrm{Id}$, which corresponds to a full observation with
the noise corresponding to an additive random field;

2. $\mathcal{O}_n = (e_{x_i})_{i=1}^n$ with $e_{x_i}$ corresponding to
evaluations at $x_i \in D$. Then the data takes the form

$$y_n = (p(x_i; a))_{i=1}^n + \xi_n.$$

The posterior consistency for the elliptic inverse problem is based on a
stability result for the noiseless inverse problem [29] and the fact that
Bayesian inverse problems behave well under transformations (c.f. Theorem
[B]). This allows us to recast the problem

$$\text{Elliptic Inverse Problem:}\quad
\begin{array}{ll}
\text{Prior} & \mu_0 \text{ on } a \\
\text{Data} & y = \mathcal{G}_n(a) + \xi_n = \mathcal{O}_n(p(\cdot; a)) + \xi_n, \quad \xi_n \sim \mathcal{N}(0, \Gamma_n) \\
\text{Posterior} & \dfrac{d\mu^{y_n}}{d\mu_0}(a) \propto \exp\Big(-\tfrac12\|\mathcal{G}_n(a)\|_{\Gamma_n}^2 + \langle y, \mathcal{G}_n(a)\rangle_{\Gamma_n}\Big)
\end{array}$$

as a Bayesian nonparametric regression problem

$$\text{Bayesian Regression Problem:}\quad
\begin{array}{ll}
\text{Prior} & p(\cdot;\cdot)_\star\mu_0 \text{ on } p \\
\text{Data} & y = \mathcal{O}_n(p) + \xi_n, \quad \xi_n \sim \mathcal{N}(0, \Gamma_n) \\
\text{Posterior} & \dfrac{d\mu_p^{y_n}}{d(p(\cdot;\cdot)_\star\mu_0)}(p) \propto \exp\Big(-\tfrac12\|\mathcal{O}_n(p)\|_{\Gamma_n}^2 + \langle y, \mathcal{O}_n(p)\rangle_{\Gamma_n}\Big)
\end{array}$$

For $\mathcal{O}_n = \mathrm{Id}$ we prove posterior consistency under weak
assumptions on the regularity of the support of the prior and on the tail
behaviour of the prior. For the inversion of point observations we show
posterior consistency under strong assumptions on the prior and on the
properties of the sequence $\{x_i\}$. The latter are justified by constructing
a counterexample in which $\{x_i\}$ is merely dense. In both cases we prove a
rate of posterior consistency if an asymptotic lower bound of a certain type
on the small ball probability is known. For a popular class of priors |30i[^
we provide a novel asymptotic lower bound on the small ball probability that
makes the results explicit for this situation.

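1.4 Consistency through Stability

In the following theorem we summarise the link between stability and posterior
consistency in a general abstract form.

Theorem. Suppose $\mathcal{G}_n = \mathcal{O}_n \circ G$ with
$G : (X, \|\cdot\|_X) \to (Y, \|\cdot\|_Y)$ and
$\mathcal{O}_n : (Y, \|\cdot\|_Y) \to (Z, \|\cdot\|_Z)$. Moreover, suppose that
a stability result of the form

$$\|a_1 - a_2\|_X \le b(\|G(a_1) - G(a_2)\|_Y)$$

holds, where $b : \mathbb{R}^+ \to \mathbb{R}^+$ is increasing with $b(0) = 0$,
and that $(G_\star\mu_0, \mathcal{O}_n, \mathcal{L}(\xi_n))$ is posterior
consistent with respect to $\|\cdot\|_Y$ with rate $\varepsilon_n$. Then
$(\mu_0, \mathcal{G}_n, \mathcal{L}(\xi_n))$ is posterior consistent with
respect to $\|\cdot\|_X$ with rate $b(\varepsilon_n)$.

Proof. Analogously to Section 1.3 we denote the posteriors for
$(\mu_0, \mathcal{G}_n, \mathcal{L}(\xi_n))$ and
$(G_\star\mu_0, \mathcal{O}_n, \mathcal{L}(\xi_n))$ by $\mu^{y_n}$ and
$\mu_G^{y_n}$ respectively. The stability result gives the inclusion
$G^{-1}(B^Y_{\varepsilon_n}(G(a^\dagger))) \subseteq B^X_{b(\varepsilon_n)}(a^\dagger)$,
and Theorem [B] implies

$$\mu^{y_n}\big(B^X_{b(\varepsilon_n)}(a^\dagger)\big) \ge \mu^{y_n}\big(G^{-1}(B^Y_{\varepsilon_n}(G(a^\dagger)))\big) = \mu_G^{y_n}\big(B^Y_{\varepsilon_n}(G(a^\dagger))\big).$$

□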
1.5 Outline

In Section 2 we use the link between stability and posterior consistency to
obtain posterior consistency with rates for the Bayesian approach to the
elliptic inverse problem. For this we provide novel posterior consistency
results for Bayesian nonparametric regression in Section 3. In Section 4 we
draw a conclusion and describe avenues of further research. A review and the
notation for infinite dimensional Gaussian measures and Hilbert scales can be
found in the Appendix.

Acknowledgement. The author would like to thank Professor Andrew Stuart 
and Professor Martin Hairer for helpful discussions and proof reading of the 
paper. SJV is supported by ERG. 



2 Consistency for the Elliptic Inverse Problem 

In this section we derive posterior consistency results for the Elliptic
Inverse Problem introduced in Section 1.3. These are based on the posterior
consistency results for the Bayesian Regression Problem given in Section 3.

Since existence and regularity results for second order elliptic equations are
well known, we will refer to the appropriate statements in [11, 18]. Taking
$a$ as the unknown in Equation (1.3) leads to a hyperbolic PDE. Under the
following assumption there exists a unique solution $a$ [29] (no additional
boundary conditions are required).

Assumption 3. (Forward conditions)

1. $D$ is compact, satisfies the exterior sphere condition and has a smooth
boundary.

2. $f$ is smooth in $D$.

3. $a \ge a_{\min} > 0$ and $f \ge f_{\min} > 0$ in Equation (1.3).

This establishes the following stability result [29].



Proposition 4. If $p$ arises as a solution to Equation (1.3) with $a$ as
diffusion coefficient satisfying Assumption 3, then

$$-\nabla\cdot(a\nabla p) = f(x) \ \text{in } D, \qquad p = 0 \ \text{on } \partial D$$

for any $f \in L^\infty(D)$ is uniquely solvable for $a$ with

$$\|a\|_\infty \le D(a_{\min}, f_{\min}, \|\cdot\|_\infty)\,\|f\|_\infty.$$

Proposition 5. If Assumption 3 holds and $p(\cdot; a_1)$ and $p(\cdot; a_2)$
are solutions to Equation (1.3), then

$$\|a_1 - a_2\|_\infty \le C\,\|a_1\|_{C^1}\,\|p(\cdot; a_1) - p(\cdot; a_2)\|_{C^2}.$$

This proposition and Theorem [B] allow us to transfer consistency results for
the Bayesian Regression Problem to the Elliptic Inverse Problem.
2.1 Posterior Consistency in the Small Noise Limit

The results in Section 3 are formulated in terms of an abstract Hilbert scale.
We consider $\xi \sim \mathcal{N}(0, (-\Delta)^{-r}_{\mathrm{Dirichlet}})$
because in this case classical results from [T5] can be used to satisfy the
regularity assumptions in Section 3. The reason for this is that the abstract
Hilbert scale $\mathcal{H}^s$ corresponds to the Sobolev space $H^{rs}$. We
will first derive a posterior consistency result without an explicit rate if
the prior merely satisfies a uniform almost sure bound on an appropriate
Hölder norm. We then obtain an explicit posterior consistency rate for a class
of priors as considered in [31]. In order to obtain these results, we derive a
novel asymptotic lower bound on the small ball probability.

For this problem Assumption 18 holds with $\sigma_0 = \frac{d}{2r}$. The reason
for this is that the operator $(-\Delta)_{\mathrm{Dirichlet}}$ has eigenvalues
$\lambda_k$ with $\lambda_k \asymp k^{2/d}$, where $d$ is the dimension of the
domain $D$ |39l [55].

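Theorem 6. Suppose that the prior $\mu_0$ satisfies

$$a \ge \lambda > 0 \quad\text{and}\quad \|a\|_{C^\alpha} \le \Lambda \quad \mu_0\text{-a.s.}$$

for $\alpha > 0$ with $\alpha \notin \mathbb{N}$ and the noise is given by
$\xi \sim \mathcal{N}(0, (-\Delta)^{-r}_{\mathrm{Dirichlet}})$. If
$\alpha > r + \frac{d}{2} - 2$, then the Elliptic Inverse Problem is posterior
consistent with respect to the $C^{\tilde\alpha}$-norm for any
$\tilde\alpha < \alpha$.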



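Proof. Subsequently, $C$ will denote a generic constant that may change from
line to line. We will first show posterior consistency in $L^\infty$ and then
use an interpolation inequality to bootstrap to $C^{\tilde\alpha}$. In order to
show posterior consistency of the Elliptic Inverse Problem in the
$L^\infty$-norm, it is enough to show posterior consistency of the Bayesian
Regression Problem in the $C^2$-norm since

$$\mu^{y_n}\big(B^{L^\infty}_{\varepsilon}(a^\dagger)\big) = \mu_p^{y_n}\big(p(B^{L^\infty}_{\varepsilon}(a^\dagger))\big) \ge \mu_p^{y_n}\big(B^{C^2}_{\varepsilon/C}(p^\dagger)\big) \tag{2.1}$$

by an application of Proposition 5 and Theorem [B], where $\mu_p^{y_n}$ denotes
the posterior of the Bayesian Regression Problem and
$p^\dagger = p(\cdot; a^\dagger)$. Using Theorem 6.19 in [18], we know that

$$\|p\|_{H^{\alpha+2}} \le K \quad \mu_0\text{-a.s.}$$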

Since $\alpha + 2 > r$, $p$ is $\mu_0$-a.s. an element of the Cameron-Martin
space of $\mu_\xi$, as this space corresponds to $H^r$ [3]. Posterior
consistency of the Bayesian Regression Problem with respect to the $H^r$-norm
is now implied by Theorem 20 because $\alpha > r + \frac{d}{2} - 2$ implies

$$s = \frac{\alpha + 2}{r} > 1 + \frac{d}{2r} = 1 + \sigma_0.$$

In order to bootstrap to posterior consistency in the $C^2$-norm, we use a
generalisation of the Sobolev embedding theorem to Besov spaces and
interpolation inequalities between Besov spaces on domains [34]. We first note
that $H^r = B^r_{2,2}$ and $C^r = B^r_{\infty,\infty}$ for $r \notin \mathbb{Z}$.
In particular Theorem 4.33 in [34] implies that

$$\|g\|_{B^{r - d/2 - \gamma}_{\infty,\infty}} \le C\|g\|_{H^r}$$

for $\gamma > 0$ chosen small. If $r > \frac{d}{2} + 2$, then we can conclude
posterior consistency in the $C^2$-norm because for $\gamma$ small enough

$$B^{C^2}_{\varepsilon}(p^\dagger) \supseteq \big\{p : \|p - p^\dagger\|_{B^{r - d/2 - \gamma}_{\infty,\infty}} < \varepsilon/C\big\} \supseteq \big\{p : \|p - p^\dagger\|_{H^r} < \varepsilon/C\big\}. \tag{2.2}$$

Otherwise an interpolation inequality between Besov spaces (see Theorem 4.17 
in [34]) yields 

\\g\\c2+^ < ||5|L'-f-T||ff||sl+^ (2.3) 



for 7 small enough and with 6 = — — -§ . Similar to Equation (2.2 1 



e 

bZI^'' - KCl 



Equations (2.2) and (2.4) allow us to bootstrap the posterior consistency of
the regression posterior $\mu_p^{y_n}$ to $C^2$. By Equation (2.1) this implies
posterior consistency of $\mu^{y_n}$ in the $L^\infty$-norm. Similarly, we
bootstrap to $\tilde\alpha < \alpha$ using the same interpolation technique as
above. □

2.2 Posterior Consistency Rate for Uniform Prior 

We assume that the prior is given by

$$\mu_0 = \mathcal{L}\Big(a_0(x) + \sum_{i=1}^\infty \gamma_i z_i \psi_i(x)\Big), \qquad z_i \overset{\text{i.i.d.}}{\sim} U[-1,1] \tag{2.5}$$

with $\|\psi_i\|_{C^\beta} = 1$, $\gamma_i > 0$ and
$S = \sum_{i=1}^\infty \gamma_i$. Similar priors were used in |3H I22]. We
assume that $a_0$ is chosen such that

$$0 < a_{\min} \le a \le a_{\max} \quad \mu_0\text{-a.s.}$$

We will derive a small ball asymptotic for the prior above and follow the
argument of Theorem 6 more carefully to obtain a rate. Since the series above
converges absolutely, we can rearrange the sum and assume without loss of
generality that $\gamma_i$ is non-increasing. We denote

$$S_s = \Big(\sum_{n \ge 1} \gamma_n^s\Big)^{1/s}$$

in order to use a classical result from approximation theory [7]:

$$\sum_{n > N} \gamma_n \le N^{1 - 1/s}\, S_s. \tag{2.6}$$
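The tail bound (2.6) is easy to probe numerically. The sketch below assumes the reading $S_s = (\sum_n \gamma_n^s)^{1/s}$ with $0 < s < 1$ and a non-increasing sequence; the coefficients $\gamma_n = n^{-3}$ are a hypothetical example, not the prior studied in the paper.

```python
import numpy as np

def stechkin_gap(gamma, s, N):
    """Return (tail, bound) with tail = sum_{n>N} gamma_n and bound = N^(1-1/s) * S_s."""
    gamma = np.sort(np.asarray(gamma, dtype=float))[::-1]  # non-increasing rearrangement
    tail = gamma[N:].sum()                                  # entries with index n > N
    s_norm = (gamma ** s).sum() ** (1.0 / s)                # S_s = (sum gamma_n^s)^(1/s)
    return tail, N ** (1.0 - 1.0 / s) * s_norm

# Hypothetical, truncated coefficient sequence gamma_n = n^{-3}, n = 1, ..., 2000
example_gamma = np.arange(1, 2001, dtype=float) ** -3.0
```

Calling `stechkin_gap(example_gamma, 0.5, 10)` returns the tail sum together with its bound; smaller values of `s` give a faster decay in `N` at the price of a larger constant `S_s`.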
Assumption 7. There is an $0 < s^* < 1$ such that for all $s > s^*$

$$S_s < \infty.$$



Theorem 8. Suppose $\mu_0$ is given as in Equation (2.5) and

$$a^\dagger = a_0 + \sum_{i=1}^\infty \gamma_i z_i^\dagger \psi_i(x), \quad\text{with } z_i^\dagger \in [-1, 1].$$

Then for any $s > s^*$

Proof. We obtain an asymptotic lower bound on the small ball probability by
prescribing $z_i$ for $i = 1, \dots, N_\varepsilon$ and ignoring the higher
modes. Let $s^* < \tilde s < s$; then Equation (2.6) implies



n>Ng 

for Nf, > (^f^) . We find a subset of (a^) by prescribing one sided 
intervals for Zj 

Bf{a^) D {{zj >0A zl-^<z,< zj) V {zj < A zj + ^ > z, > zj) t = 1 . 
Hence 

t^oisfia^) > {^r- 



□ 



The regularity results in [18] imply the following continuity result

$$\|p(a_1) - p(a_2)\|_{C^{\beta+1}} \le C\|a_1 - a_2\|_{C^\beta}. \tag{2.7}$$

This result implies that the push-forward prior $p(\cdot;\cdot)_\star\mu_0$
satisfies the same asymptotic lower bound on the small ball probability for
$\|\cdot\|_{C^{\beta+1}}$ as $\mu_0$ for $\|\cdot\|_{C^\beta}$. In the following
we illustrate how this can be used to obtain a rate in Theorem 6. We denote by
$\kappa_X$ and $\kappa_Y$ the algebraic rate of posterior consistency of
$\mu^{y_n}$ with respect to the $X$-norm and the rate of posterior consistency
of the regression posterior $\mu_p^{y_n}$ with respect to the $Y$-norm
respectively.

Theorem 9. Suppose that the prior $\mu_0$ is as in Equation (2.5) and
Assumption 7 is satisfied. Additionally, we assume that for
$\alpha > \beta > r + 1$ and $\alpha > r + \frac{d}{2} - 2$

$$\|a\|_{C^\alpha} \le \Lambda.$$

Then $\mu^{y_n}$ is consistent with respect to the $L^\infty$-norm with rate

,,,'l-s a-r + 2 

< : A 1 A 



2 + ^- r"^ / V2-s"2a + d-2r + 4 
Proof. In order to use Theorem 20 we need the small ball probability in $H^r$.
This follows from Theorem 8 and Equation (2.7) if $\beta + 1 > r$ and the rate
of convergence is



^ 2 + 2(2- A) 



1 



Using Equation (2.4 1, we see 



with p — x^— j- and A — ''^+2 — ^ 
that the rate of posterior consistency for fl^ in C'^ is bounded by below 



2 -s 

a+2-r 



2a + d-2r 



with 9 



a+2+'t-r • same rate holds for k^oo because of Equation (2.1 ). □ 

2.3 Posterior Consistency in the Large Data Limit

In this section we consider only the case $d = 1$ with $D = [0, 1]$ because the
general case is similar. Again we combine a prior $\mu_0(da)$ with the
observations

$$y_i = p(x_i; a) + \xi_i, \qquad i = 1, \dots, n.$$

In order to combine Proposition 5 and Theorem 15 we suppose that $\{x_i\}$
satisfies Assumption 13.

Theorem 10. Suppose $\{x_i\}$ satisfies Assumption 13 and for $\gamma > 1$,
$\|a\|_{C^\gamma} \le S$ and $a \ge a_{\min}$ $\mu_0$-a.s. Then the associated
Bayesian inverse problem is consistent in the large data limit with respect to
$C^{\tilde\gamma}$ for any $\tilde\gamma < \gamma$.

Proof. By Theorem 6.13 in [18] there is a $C(D, \gamma)$ such that for each $a$
with

$$\|a\|_{C^\gamma} \le S \quad\text{and}\quad a \ge a_{\min}$$

there is a unique solution $p$ with $\|p\|_{C^{2+\gamma}} \le C$. Thus the
push-forward prior satisfies the assumptions of Theorem 15 and the regression
posterior $\mu_p^{y_{1:n}}$ is posterior consistent in $L^\infty$. Using the
interpolation inequality between $L^\infty$ and $C^{2+\gamma}$, we get
consistency in $C^2$. As a consequence of Proposition 5, $\mu^{y_{1:n}}$ is
consistent in $L^\infty$. We can bootstrap from $L^\infty(D)$ to
$C^{\tilde\gamma}$ by interpolating between $L^\infty$ and $C^\gamma$. □



3 Posterior Consistency for Bayesian Regression 

In this section we study posterior consistency of the Bayesian Regression
Problem as defined in Section 1.3. The power of these results lies in the
combination with results like Proposition 5 from [55]. These allow us to
conclude posterior consistency for apparently harder nonlinear inverse
problems like the Elliptic Inverse Problem. Even if the prior is explicit for
the Elliptic Inverse Problem, the push-forward prior for the Bayesian
Regression Problem is very implicit. Therefore our aim is to obtain posterior
consistency under weak assumptions on the prior. We consider regression models
with response

$$y_n = \mathcal{O}_n(p) + \xi_n$$

1. corresponding to evaluations of a function with i.i.d. Gaussian noise
($\mathcal{O}_n = (e_{x_i})_{i=1}^n$), where $n$ tends to infinity;

2. where the response $y$ is a function ($\mathcal{O}_n = \mathrm{Id}$) and the
noise is a Gaussian random field that is scaled to zero,
$\xi_n = n^{-1/2}\xi$.

For the first case we assume an almost sure upper bound on a Hölder norm for
the prior and an additional condition on the locations of the observations.
Without the latter assumption we give an example of posterior inconsistency.
For general priors this has mostly been investigated for random evaluation
points $x_i$, in statistical terminology random covariates |32) 138]. In [4]
posterior consistency for deterministic $x_i$ is shown without a rate
($\varepsilon_n = \varepsilon$ in Definition 1). The assumptions are similar
because Assumption 13 corresponds to Assumption N RDd in [3]. However, their
method is quite different and is based on the construction of appropriate
tests. We provide an insightful example of posterior inconsistency if the
locations of the observations do not satisfy Assumption 13. There is much
literature on posterior inconsistency for identifying a probability
distribution from samples. The results mostly concern discrete probability
distributions [131 [T5J [HJ IH].

In the second case we prove a consistency result with respect to the
Cameron-Martin norm of the noise, which is obtained under assumptions on the
smoothness and tail behaviour of the prior. The smoothness of the error
distribution is measured in the Hilbert scale
$((\mathcal{H}^t, \langle\cdot,\cdot\rangle_t))_{t \in \mathbb{R}}$ (which we
review in the Appendix). Additionally, we assume that draws from the posterior
lie almost surely in the Cameron-Martin space of the Gaussian noise.

The second case is to the best of our knowledge not well studied in the
statistics literature. Even Functional Data Analysis as described in [12, 27]
concerns pointwise observations. However, it is a special case of linear
Bayesian inverse problems which have been studied using explicit Gaussian
methods in [TJ [21] and using methods similar to [,37^ for sieve, Gaussian and
wavelet priors in [28]. If the prior is Gaussian such that the covariance
operators of the prior and the noise are simultaneously diagonalisable, our
rates are close to the optimal minimax rate obtained in [TJ [23].

For both cases we are able to obtain a rate assuming that appropriate
asymptotic lower bounds on small ball probabilities around $a^\dagger$ are
available:

Assumption 11. (Lower Asymptotics on Small Ball Probabilities)
The prior mass of balls around the truth satisfies

In both cases posterior consistency in stronger norms can be obtained using
prior or posterior regularity with interpolation inequalities, which is
outlined in Section 3.3.

3.1 Pointwise Observations in the Large Data Limit 

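We consider the following nonparametric Bayesian regression problem

$$y_i = a(x_i) + \xi_i, \qquad i = 1, \dots, n$$

where $\xi_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2)$ and we
suppose a prior $\mu_0$ on $a \in C([0,1], \mathbb{R})$. In this case the
posterior takes the form

$$\frac{d\mu^{y_{1:n}}}{d\mu_0}(a) \propto \exp\Big(-\frac{1}{2\sigma^2}\sum_{i=1}^n \big(y_i - a(x_i)\big)^2\Big).$$

We suppose that the data is actually generated by the model with input
$a^\dagger$. Hence it follows that

$$y_i = a^\dagger(x_i) + \xi_i,$$

$$\frac{d\mu^{y_{1:n}}}{d\mu_0}(a) \propto \exp\Big(-\frac{1}{2\sigma^2}\sum_{i=1}^n \big(a(x_i) - a^\dagger(x_i)\big)^2 + \frac{1}{\sigma^2}\sum_{i=1}^n \big(a(x_i) - a^\dagger(x_i)\big)\xi_i\Big). \tag{3.1}$$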
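The identical twin experiment behind (3.1) can be illustrated on a toy problem. The sketch below assumes a prior supported on three hypothetical candidates $a_c(x) = c\,x$ with truth $c = 0$ and equispaced observation points; none of these choices is taken from the paper. It computes the posterior once from the Gaussian likelihood in terms of the data and once from the centred expression in terms of $h = a - a^\dagger$ and the noise.

```python
import numpy as np

def posterior_weights(log_likelihoods, prior):
    """Normalise prior * exp(log-likelihood) over a finite family of candidates."""
    log_w = np.log(prior) + np.asarray(log_likelihoods)
    w = np.exp(log_w - log_w.max())   # subtract the maximum for numerical stability
    return w / w.sum()

def identical_twin(n, sigma=1.0, seed=0):
    """Posterior over three hypothetical candidates a_c(x) = c * x, truth c = 0."""
    rng = np.random.default_rng(seed)
    slopes = np.array([0.0, 0.5, 1.0])        # hypothetical finite support of the prior
    prior = np.full(3, 1.0 / 3.0)
    x = (np.arange(n) + 0.5) / n               # equispaced observation points in [0, 1]
    xi = rng.normal(0.0, sigma, size=n)        # observational noise
    y = 0.0 * x + xi                           # data generated from the truth a_dagger = 0

    # Gaussian log-likelihood written in terms of the data y
    direct = [-np.sum((y - c * x) ** 2) / (2 * sigma**2) for c in slopes]
    # Same quantity up to an additive constant, written with h = a - a_dagger
    centred = [-np.sum((c * x) ** 2) / (2 * sigma**2) + np.sum(c * x * xi) / sigma**2
               for c in slopes]
    return posterior_weights(direct, prior), posterior_weights(centred, prior)
```

Both ways of writing the density give the same normalised weights, and for moderately large `n` almost all posterior mass sits on the candidate that generated the data.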

In this setup the posterior consistency depends on the prior as well as on the
sequence $\{x_i\}_{i \in \mathbb{N}}$. It is necessary to assume that the truth
lies in the support of the prior $\mu_0$. We pose the following strong
assumptions on the prior:

Assumption 12. Assume there exist $\beta \in (0, 1]$ and $S > 0$ such that

$$\|a\|_{C^\beta} \le S \quad\text{and}\quad \|a\|_\infty \le B \quad \mu_0\text{-a.s.}$$

and that the truth satisfies

One could think that it is sufficient for $\{x_i\}_{i \in \mathbb{N}}$ to be
dense in $[0, 1]$ because it is then possible to reconstruct the value of
$a^\dagger(x)$ from $y_n$. Letting $x \in [0, 1]$ be arbitrary, there is a
subsequence with $|x_{n_j} - x| \to 0$ such that

$$a^\dagger(x) = \lim_{J \to \infty} \frac{1}{J}\sum_{j=1}^J \big(a^\dagger(x_{n_j}) + \xi_{n_j}\big).$$

However, we have to impose much stronger assumptions and we will give reasons
for these below.
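The averaging heuristic can be tried out numerically. The following sketch assumes a hypothetical truth and a hypothetical sequence of observation points approaching $x$ at distance $1/(j+1)$; it only illustrates pointwise recovery at a single $x$ and says nothing about posterior consistency.

```python
import numpy as np

def local_average_estimate(a, x, J, sigma=1.0, seed=0):
    """Average J noisy observations of a taken at points converging to x."""
    rng = np.random.default_rng(seed)
    j = np.arange(1, J + 1)
    # Hypothetical observation points alternating around x, clipped to [0, 1]
    x_near = np.clip(x + (-1.0) ** j / (j + 1), 0.0, 1.0)
    y = a(x_near) + rng.normal(0.0, sigma, size=J)   # noisy point evaluations
    return y.mean()

def smooth_truth(t):
    """Hypothetical truth used for illustration."""
    return np.sin(2.0 * np.pi * t)
```

With `J` in the tens of thousands, `local_average_estimate(smooth_truth, 0.3, J)` is close to `smooth_truth(0.3)`, since the noise averages out at rate $J^{-1/2}$ and the bias from the nearby evaluation points vanishes.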
Assumption 13. We suppose that for $\{x_i\}_{i \in \mathbb{N}}$ there exists a
$K > 0$ such that for any $a < b \in [0, 1]$ there is an $N(a, b)$ such that

$$F_n(b) - F_n(a) \ge K(b - a) \quad \forall n \ge N(a, b)$$

where $F_n$ denotes the empirical distribution function of $\{x_i\}_{i=1}^n$.
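To get a feeling for Assumption 13, one can compute the increments of the empirical distribution function for a concrete design. The sketch below uses the sequence $x_i = \mathrm{frac}(i\varphi)$ with $\varphi$ the golden ratio conjugate, which is a hypothetical example of a well spread sequence and not a design considered in the paper; the constant $K = 1/2$ used in the check is likewise an illustrative choice.

```python
import numpy as np

def empirical_increment(points, a, b):
    """F_n(b) - F_n(a): the fraction of the points lying in the interval (a, b]."""
    points = np.asarray(points)
    return np.mean((points > a) & (points <= b))

def golden_ratio_sequence(n):
    """First n terms of the hypothetical design x_i = frac(i * phi)."""
    phi = (np.sqrt(5.0) - 1.0) / 2.0
    return np.mod(np.arange(1, n + 1) * phi, 1.0)

def satisfies_lower_bound(points, a, b, K):
    """Check the inequality F_n(b) - F_n(a) >= K (b - a) for one interval."""
    return empirical_increment(points, a, b) >= K * (b - a)
```

For this design the increments approach $b - a$ as $n$ grows, so the inequality holds with room to spare; a sequence that is dense but visits some region only rarely would eventually fail the check for intervals in that region.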

In U an even stronger Assumption is posed 

Assumption 14. For each interval $I \subset [0, 1]$, let $|I|$ be its Lebesgue
measure. Suppose that there exists a constant $K$ such that whenever
$|I| > \frac{1}{Kn}$ there is at least one $i$ such that $x_i \in I$.

Assumption 14 implies Assumption 13 by chopping an interval $[a, b]$ into
subintervals. We are able to obtain posterior consistency under Assumption 13
and a rate of posterior consistency if Assumption 14 is satisfied.

Theorem 15. Suppose that Assumptions 12 and 13 are satisfied, then
$\mu^{y_{1:n}}$ is posterior consistent in $L^\infty$. Moreover, if
Assumptions 11, 12 and 14 are satisfied then $\mu^{y_{1:n}}$ is posterior
consistent in $L^\infty$ with any rate $n^{-\kappa}$ where

Proof. Fix an arbitrarily small $\gamma > 0$. For notational convenience we
denote $h = a - a^\dagger$ and by $\eta$ a generic $\mathcal{N}(0, \sigma^2)$
random variable. Let $S = \sqrt{\sum_{i=1}^n h(x_i)^2}$; then the posterior
(3.1) takes the form

$$\frac{d\mu^{y_{1:n}}}{d\mu_0}(a) \propto C(n, \eta)\exp(-S^2 + S\eta).$$



From now on we work on $\Omega_n = \{\eta < n^\gamma\}$ and note that
$\mu_\xi(\Omega_n^c) \to 0$. Thus for $0 < l < 1$ we have a lower bound on



/i«-"(i3,(at)) > C(n,Oexp ( -n^ - Ml!!! ) ^^oiBUa^)). 



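Now we derive an upper bound on $\mu^{y_{1:n}}(B_\varepsilon(a^\dagger)^c)$.
Let $a \in B_\varepsilon(a^\dagger)^c$ be chosen arbitrarily. Then there is an
$x$ such that $|a^\dagger(x) - a(x)| \ge \varepsilon$ and applying the Hölder
continuity yields

$$|a^\dagger(\tilde x) - a(\tilde x)| \ge \varepsilon/2 \quad\text{for } \tilde x \in (x - \Delta x, x + \Delta x]$$

with $\Delta x = \big(\frac{\varepsilon}{4S}\big)^{1/\beta}$. We denote the
following index set

$$I = \{i \mid x_i \in (x - \Delta x, x + \Delta x]\}.$$

For $n$ larger than
$N = \max\{N(i\Delta x, (i+1)\Delta x) \mid i = 0, \dots, \lfloor 1/\Delta x \rfloor - 1\}$
we have

$$\frac{K}{2}\,\Delta x \le F_n(x + \Delta x) - F_n(x - \Delta x) = \frac{|I|}{n}.$$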
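The Hölder step in this argument can be spelled out as follows. This is a sketch under the assumption that both $a$ and $a^\dagger$ have $C^\beta$-seminorm at most $S$ (the bound on the truth is an assumption made here), with $h = a - a^\dagger$, a point $x$ where $|h(x)| \ge \varepsilon$, and $\Delta x = (\varepsilon/(4S))^{1/\beta}$:

```latex
\begin{align*}
|h(\tilde x) - h(x)|
  &\le \big(|a|_{C^\beta} + |a^\dagger|_{C^\beta}\big)\,|\tilde x - x|^\beta
   \le 2S\,(\Delta x)^\beta
   = 2S \cdot \frac{\varepsilon}{4S}
   = \frac{\varepsilon}{2}, \\
|h(\tilde x)|
  &\ge |h(x)| - |h(\tilde x) - h(x)|
   \ge \varepsilon - \frac{\varepsilon}{2}
   = \frac{\varepsilon}{2}
   \qquad \text{for } |\tilde x - x| \le \Delta x .
\end{align*}
```

Every observation point in the window of half-width $\Delta x$ around $x$ therefore contributes at least $\varepsilon^2/4$ to $S^2 = \sum_i h(x_i)^2$.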
If we only consider the $x_i$ for $i \in I$, we obtain that



We note that $f(S) = -S^2 + Sn^\gamma$ is decreasing for $S > n^\gamma/2$. Thus

,„ /-r, / ^/ ,x / ne'^KAx ,K ^ ,ie i, 
tiy^-"{B,{a^Y) < C{n,Ocxp i + {-Ax)^--n2 + 

By choosing small enough we obtain 

, „ , , , , as n -> oo. 

In order to obtain a rate of posterior consistency we use k > k 



1 



C(n,^)exp(-^n 



2(72 



(3.2) 



If 1 — (2 + > \ ~ k(1 + ^) we get for the upper bound 



M^-"(i?„-.(at)=) <C(n,e)exp 



8(45) 



72(45') : 



(3.3) 

The first term in the exponential in the lower bound (3.2) is dominant over the
corresponding term in (3.3) by choosing

k k(1 



2/3 



+ 7- 



Moreover, the first term in (3.2) and (3.3) are dominant if 

1 — 2k > kp 

1 - k 

I -2k > ^-+7 

l-(2+^)« > ^--(1 + ^). 

Choosing 7 small enough, we see that /i^" is consistent in with any rate 

1 2/3 



K < min{ 



3(l+2L)'(2/3 + l)(2 + p) 



}• 



□ 
3.1.1 Example of Posterior Inconsistency

From any dense sequence $\{x_i\}_{i=1}^\infty$ it is possible to extract a
subsequence satisfying Assumption 13. All other observations can be viewed as
additional observations. We will provide an example to show that posterior
inconsistency can arise through these additional observations.

By choosing the prior concentrated on piecewise linear functions $g$ with
nodes at $0$, $\frac12$ and $1$ and $g(\frac12) = 0$, and assuming that
$a^\dagger = 0$, the two dimensional example below can be extended to the
setting above. This extension is an example of posterior inconsistency in the
$L^p$-norm for $1 \le p \le \infty$ since on this class of functions all these
norms are equivalent to any 2 dimensional norm on the endpoints.

Example 16. We consider the following prior on $\mathbb{R}^2$



Mo = C'X!^(;^,o) exp(-2fc^) + <5(_^,i) exp(-fc2) 

k=l 

and choose = (0,0) in the support of ^o- Given n and measurements 

of 

Ui = a{+ Ci and jji ^ al + |i with 1; ''^ ' 7V(0, 1) 
respectively. Consequently, the posterior takes the form 



/ n n 



2^'^ ^-^ 2 



where we suppose < 6* < 1. Here posterior consistency corresponds to that for 
any K there is ^„ t 1 such that 

P„ ^^(J'l -Si:-.") ^ (J (^,0)^ > ^ 1 for n^oo. 

We will show that there is Z„ I such that for A = lJfeLi(ii 0) 

(^^(v^-'^'Vi:^^) (A) < ^ 1 for n^oo. 

For this it is enough to construct sets f2„ of increasing /i^-probability so that 
on this set the following ratio converges to zero 



2t 



E.exp(-lE;=ii-|0-2P^ 



exp (-i e;=i h 7s^^ - 5 t:U 1 ~ 2a - 
The /i^-probability of |E"=iCjl > Cns+'r and |E"=ifjl > Cn2+^ are
exponentially small in n. Thus it is enough to consider

exp {~lnl~ 2P) _ Ef exp (-Ini - 2fc^) + E:^exp (-inl ~ 2P) 
E. exp (-in^ - exp (-^n^L _ fc2) + ^oo^exp - 

< max |cxp(— --^/n), exp(— n)|- ^ as n — > oo. 

3.2 The Small Noise Limit for Functional Response 

We consider the following regression model

$$y = a + \frac{1}{\sqrt{n}}\,\xi \tag{3.4}$$

where $y$, $a$ and $\xi$ are elements of a Hilbert space $\mathcal{H}$. We
assume that the observational noise is a Gaussian random field
$\mu_\xi = \mathcal{N}(0, \Gamma)$ on $\mathcal{H}$. We consider the Hilbert
scale $(\mathcal{H}^t, \langle\cdot,\cdot\rangle_t)$ defined with respect to
$\Gamma$. The Cameron-Martin space of $\mu_\xi$ corresponds to
$\mathcal{H}^1$. From now on we make the following assumption.

Assumption 17. (Prior Smoothness)

There is an $s$ such that the prior satisfies

$$\mu_0(\mathcal{H}^s) = 1.$$

In this case the posterior is well defined and takes the form [33]

$$\frac{d\mu^{y_n}}{d\mu_0}(a) = C(n, \xi)\exp\Big(-\frac{n}{2}\|a\|_1^2 + n\langle a, y\rangle_1\Big).$$

Proving posterior consistency corresponds to performing an identical twin
experiment with data generated from a fixed "truth" $a^\dagger$

$$y = a^\dagger + \frac{1}{\sqrt{n}}\,\xi. \tag{3.5}$$

By changing the normalising constant the posterior can be written as

$$\frac{d\mu^{y_n}}{d\mu_0}(a) = C(n, \xi)\exp\Big(-\frac{n}{2}\|a - a^\dagger\|_1^2 + \sqrt{n}\,\langle a - a^\dagger, \xi\rangle_1\Big). \tag{3.6}$$
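The passage from the posterior density to (3.6) amounts to completing the square. The following sketch writes $\langle\cdot,\cdot\rangle_1$ for the Cameron-Martin inner product, inserts $y = a^\dagger + n^{-1/2}\xi$ and treats the pairing with $\xi$ formally, as in the text:

```latex
\begin{align*}
-\frac{n}{2}\|a\|_1^2 + n\langle a, y\rangle_1
 &= -\frac{n}{2}\|a\|_1^2 + n\langle a, a^\dagger\rangle_1
    + \sqrt{n}\,\langle a, \xi\rangle_1 \\
 &= -\frac{n}{2}\|a - a^\dagger\|_1^2
    + \sqrt{n}\,\langle a - a^\dagger, \xi\rangle_1
    + \underbrace{\frac{n}{2}\|a^\dagger\|_1^2
    + \sqrt{n}\,\langle a^\dagger, \xi\rangle_1}_{\text{independent of } a} .
\end{align*}
```

The last two terms do not depend on $a$ and are absorbed into the normalising constant $C(n, \xi)$.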



The expression above suggests that the posterior concentrates in balls around
the truth in the Cameron-Martin norm. First we consider priors that have an
almost sure uniform upper bound in the $\|\cdot\|_s$-norm. This is the case
for the priors considered in Section 2. In a second step we assume that the
prior has higher exponential moments. In the case of a Gaussian prior we show
that our rate is close to the optimal minimax rate obtained in [24]. Before
stating and proving our main results, we introduce an assumption on the
observational noise.
3.2.1 Properties of the Observational Noise

The smoothness of the noise in terms of the Hilbert scale
$(\mathcal{H}^t, \langle\cdot,\cdot\rangle_t)$ is determined by the decay of
the eigenvalues $\lambda_k$ of $\Gamma$. Subsequently we assume

Assumption 18. There exists $\sigma_0$ such that $\Gamma^\sigma$ is trace class
for all $\sigma > \sigma_0$. That is

$$\sum_{k=1}^\infty \lambda_k^\sigma < \infty.$$
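For the noise covariance $(-\Delta)^{-r}_{\mathrm{Dirichlet}}$ of Section 2.1 the trace condition can be explored numerically. The sketch below assumes model eigenvalues $\lambda_k = k^{-2r/d}$ exactly (the text only states the growth of the eigenvalues of the Laplacian up to constants), so that the threshold is $\sigma_0 = d/(2r)$.

```python
import numpy as np

def partial_trace(sigma, r, d, n_terms):
    """Partial sum of sum_k lambda_k^sigma for model eigenvalues lambda_k = k^(-2r/d)."""
    k = np.arange(1, n_terms + 1, dtype=float)
    return np.sum(k ** (-2.0 * r * sigma / d))

def sigma_threshold(r, d):
    """Threshold sigma_0 = d / (2r) for the model eigenvalues above."""
    return d / (2.0 * r)
```

For `r = 2` and `d = 2` the threshold is `0.5`: with `sigma = 1.0` the partial sums stay below $\pi^2/6$, whereas at `sigma = 0.5` they are harmonic sums and grow like the logarithm of the number of terms.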

We quote the following Lemma from [T]: 
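Lemma 19. Under Assumption 18 we have:

1. Let $\zeta$ be a white noise. Then $\mathbb{E}\|\Gamma^{\sigma/2}\zeta\|^2 < \infty$ for all $\sigma > \sigma_0$.

2. Let $u \sim \mathcal{N}(0, \Gamma)$. Then $u \in \mathcal{H}^{1-\sigma}$ a.s. for every $\sigma > \sigma_0$.
Proof.

1. We have that $\Gamma^{\sigma/2}\zeta \sim \mathcal{N}(0, \Gamma^\sigma)$, thus $\mathbb{E}\|\Gamma^{\sigma/2}\zeta\|^2 < \infty$ is equivalent to $\Gamma^\sigma$
being of trace class. By Assumption 18 it suffices to have $\sigma > \sigma_0$.

2. We have $\mathbb{E}\|u\|_{1-\sigma}^2 = \mathbb{E}\|\Gamma^{\frac{\sigma-1}{2}}u\|^2 = \mathbb{E}\|\Gamma^{\frac{\sigma}{2}}\Gamma^{-\frac{1}{2}}u\|^2 = \mathbb{E}\|\Gamma^{\frac{\sigma}{2}}\zeta\|^2$, where $\zeta$ is a white
noise, therefore using part 1 we get the result.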

□ 
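The trace-class condition in Assumption 18 can be made concrete for polynomially decaying eigenvalues. The sketch below assumes $\lambda_k = k^{-2r}$, an illustrative choice, for which $\sum_k \lambda_k^\sigma = \sum_k k^{-2r\sigma}$ is finite exactly when $\sigma > \frac{1}{2r}$. The function name and the truncation levels are hypothetical.

```python
import math


def partial_trace(r, sigma, terms):
    """Partial sum of sum_k lambda_k**sigma for the assumed decay lambda_k = k**(-2*r)."""
    return sum(k ** (-2.0 * r * sigma) for k in range(1, terms + 1))


if __name__ == "__main__":
    # sigma above the threshold 1/(2r): the partial sums stabilise
    print(partial_trace(1.0, 1.0, 10**3), partial_trace(1.0, 1.0, 10**5))
    # sigma at the threshold: harmonic series, the partial sums keep growing
    print(partial_trace(1.0, 0.5, 10**3), partial_trace(1.0, 0.5, 10**5))
```

For $r = 1$ and $\sigma = 1$ the partial sums approach $\pi^2/6$, whereas at the threshold $\sigma = \frac{1}{2}$ they grow logarithmically in the number of terms.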



3.2.2 Posterior Consistency for Uniformly Bounded Priors 

The following Theorem has important applications in Section 2.



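Theorem 20. Suppose Assumption 11 with balls in $\mathcal{H}^1$, Assumption 17 and
Assumption 18 are satisfied and that $s > 1 + \sigma_0$. If

$$\|a\|_s \le U \quad \mu_0\text{-a.s.} \quad\text{and}\quad a^\dagger \in \mathcal{H}^s \qquad (3.7)$$

then for any $\kappa < \min\left\{\frac{1}{2(2-\lambda)}, \frac{1}{2+\rho}\right\}$, where $\lambda = \frac{s-1-\sigma_0}{s-1}$, the posterior $\mu^{y_n}$ is
consistent in $\mathcal{H}^1$ with rate $n^{-\kappa}$.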

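Proof. We choose sets $\Omega_n$ such that $\mu_\xi(\Omega_n) \to 1$ and

$$\frac{\mu^{y_n}\left(B^1_{\epsilon n^{-\kappa}}(a^\dagger)^c\right)}{\mu^{y_n}\left(B^1_{\epsilon n^{-\kappa}}(a^\dagger)\right)} \to 0 \quad\text{uniformly in } \Omega_n \text{ as } n \to \infty. \qquad (3.8)$$

The result then follows because $\mu^{y_n}(B^1_{\epsilon n^{-\kappa}}(a^\dagger)) + \mu^{y_n}(B^1_{\epsilon n^{-\kappa}}(a^\dagger)^c) = 1$. We deal
with $\langle a - a^\dagger, \xi\rangle_1$ by smoothing $\xi$ at the expense of $a - a^\dagger$:

$$\langle a-a^\dagger, \xi\rangle_1 = \left\langle \Gamma^{-\frac{1+\sigma_0+\gamma}{2}}(a-a^\dagger),\, \Gamma^{-\frac{1-\sigma_0-\gamma}{2}}\xi\right\rangle \le \|a-a^\dagger\|_{1+\sigma_0+\gamma}\,\|\xi\|_{1-\sigma_0-\gamma}.$$

Using Fernique's Theorem [3] and Markov's inequality, we know that there exist
subsets $\Omega_n \subset \Omega$ such that

$$|\langle a-a^\dagger, \xi\rangle_1| \le K_n\|a-a^\dagger\|_{1+\sigma_0+\gamma} \quad\text{for all } \xi \in \Omega_n,$$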
Pd^n) > 1 ^ 



exp(4T?r-f'^«) 



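Interpolating between $\mathcal{H}^1$ and $\mathcal{H}^s$ for $s > 1 + \sigma_0 + \gamma$ yields

$$|\langle a-a^\dagger, \xi\rangle_1| \le K_n\|a-a^\dagger\|_1^{\lambda}\,\|a-a^\dagger\|_s^{1-\lambda} \le K_n\|a-a^\dagger\|_1^{\lambda} \qquad (3.9)$$

with $\lambda = \frac{s-1-\sigma_0-\gamma}{s-1}$, where the factor $\|a-a^\dagger\|_s^{1-\lambda}$, which is bounded because of (3.7), has been absorbed into $K_n$. Using Equation (3.6) yields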



C(n,^) inf exp(~n||a- a'flP - ^A^(a - a^^) )/io(BL_,) 



> C(n, ^) exp(— r 



(3.9) 



-a^\\l{lf^K,,n'^-'-C-)')MBL-.). (3.10) 



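Similarly, we also obtain an upper bound

$$\mu^{y_n}\left(B^1_{\epsilon n^{-\kappa}}(a^\dagger)^c\right) \le C(n,\xi) \sup_{a \in B^1_{\epsilon n^{-\kappa}}(a^\dagger)^c} \exp\left(-\frac{n}{2}\|a-a^\dagger\|_1^2 + K_n\sqrt{n}\,\|a-a^\dagger\|_1^{\lambda}\right).$$

The expression in the exponential can be written as $f(\|a-a^\dagger\|_1)$ with
$f(d) = -\frac{n}{2}d^2 + K_n n^{1/2} d^{\lambda}$, which is decreasing on $[(K_n\lambda n^{-1/2})^{\frac{1}{2-\lambda}}, \infty)$. We suppose that

$$\kappa < \frac{1}{2(2-\lambda)} \qquad (3.11)$$

which for $n$ large enough leads to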

^J.y {Bl^-^a^)) < C(n,Oexp (-e^n^-^^ 



(3.12) 
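The monotonicity of $f(d) = -\frac{n}{2}d^2 + K_n n^{1/2} d^\lambda$ beyond the threshold $(K_n \lambda n^{-1/2})^{1/(2-\lambda)}$ follows from $f'(d) = -nd + K_n n^{1/2}\lambda d^{\lambda-1}$. The following minimal sketch checks this numerically on a grid; the parameter values, the grid and the function names are illustrative assumptions.

```python
def f(d, n, K, lam):
    """f(d) = -n/2 d^2 + K sqrt(n) d^lam, the exponent as a function of the distance d."""
    return -0.5 * n * d * d + K * n ** 0.5 * d ** lam


def threshold(n, K, lam):
    """Stationary point of f: (K lam n^(-1/2))^(1/(2-lam))."""
    return (K * lam * n ** -0.5) ** (1.0 / (2.0 - lam))


def is_decreasing_beyond_threshold(n, K, lam, steps=200, span=5.0):
    """Check on a grid that f is non-increasing on [d0, (1 + span) d0]."""
    d0 = threshold(n, K, lam)
    grid = [d0 * (1.0 + span * i / steps) for i in range(steps + 1)]
    values = [f(d, n, K, lam) for d in grid]
    return all(values[i + 1] <= values[i] for i in range(steps))
```

Since the threshold behaves like $n^{-1/(2(2-\lambda))}$ for fixed $K_n$, a radius $\epsilon n^{-\kappa}$ eventually exceeds it when $\kappa < \frac{1}{2(2-\lambda)}$.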



We now derive sufficient conditions for $n^{1-2\kappa}$ to be the dominant term in the
exponential in Equations (3.10) and (3.12), which implies Equation (3.8). This
is the case if in addition to Inequality (3.11)



$$1 - 2\kappa > \frac{1}{2} - \kappa\lambda$$



hold. The first line is equivalent to Inequality (3.11) and the second line is
implied by Assumption 11 if

$$1 - 2\kappa > \kappa\rho. \qquad (3.13)$$

Now the Inequalities (3.11) and (3.13) imply Statement (3.8). Letting $\gamma \to 0$
concludes the proof. □
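To see how the two restrictions on $\kappa$ interact, the following sketch evaluates the bound obtained by reading (3.11) as $\kappa < \frac{1}{2(2-\lambda)}$ with $\lambda = \frac{s-1-\sigma_0}{s-1}$ (the limit $\gamma \to 0$) and rewriting (3.13) as $\kappa < \frac{1}{2+\rho}$. This reading of (3.11) and the function names are assumptions made for illustration.

```python
def interpolation_exponent(s, sigma0):
    """lambda = (s - 1 - sigma0) / (s - 1), the interpolation exponent for gamma -> 0."""
    return (s - 1.0 - sigma0) / (s - 1.0)


def rate_bound(s, sigma0, rho):
    """Supremum of admissible kappa under the assumed readings of (3.11) and (3.13)."""
    if s <= 1.0 + sigma0:
        raise ValueError("requires s > 1 + sigma0")
    lam = interpolation_exponent(s, sigma0)
    from_noise = 1.0 / (2.0 * (2.0 - lam))   # inequality (3.11)
    from_small_ball = 1.0 / (2.0 + rho)      # inequality (3.13)
    return min(from_noise, from_small_ball)
```

For example, with $s = 3$ and $\sigma_0 = 0.5$ one has $\lambda = 0.75$, so the noise restriction gives $0.4$; the small ball restriction is the binding one as soon as $\rho > 0.5$.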
The assumptions on the prior tails can be weakened to supposing exponential
moments of $\|\cdot\|_s$. The price we pay is that the rate of convergence is determined
by a low-dimensional optimisation problem.

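Theorem 21. Suppose that the truth is in $\mathcal{H}^s$ and that Assumption 11
with balls in $\mathcal{H}^1$, Assumption 17 and Assumption 18 are satisfied. Additionally
assume $\int \exp(3f\|a\|_s^e)\,d\mu_0(a) < \infty$ for $f > 0$, $e > 0$ and $s > 1 + \sigma_0$. If the
following optimisation problem has a solution $\kappa^* > 0$

Maximize $\kappa$

with respect to $\kappa$, $p > 1$, $\eta, \theta > 0$ subject to

$$1 - 2\kappa > \frac{1}{2} + \frac{p}{q}\eta - \kappa\lambda p \qquad (C.3)$$

$$1 - 2\kappa > \frac{1}{2} - \eta + (1-\lambda)q\theta \qquad (C.4)$$

$$\rho\kappa < e\theta \qquad (C.6)$$

$$1 - 2\kappa > \rho\kappa \qquad (C.7)$$

$$\lambda p < 2 \qquad (C.8)$$

$$(1-\lambda)q < e \qquad (C.13)$$

$$\left(\frac{1}{2} - \eta\right)\left(1 + \frac{1}{e-(1-\lambda)q}\right) < \max(1 - 2\kappa, \theta e) \qquad (C.16)$$

where $\lambda = \frac{s-1-\sigma_0}{s-1}$ and $\frac{1}{p} + \frac{1}{q} = 1$, then for any $\kappa < \kappa^*$ the posterior $\mu^{y_n}$ is consistent in
$\mathcal{H}^1$ with rate $n^{-\kappa}$.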

Proof. See Appendix. □ 
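The optimisation problem in Theorem 21 is low-dimensional, so candidate points can be screened directly. The sketch below checks the constraints (C.3), (C.4), (C.6), (C.7), (C.8) and (C.13) in the form $1-2\kappa > \frac12 + \frac{p}{q}\eta - \kappa\lambda p$, $1-2\kappa > \frac12 - \eta + (1-\lambda)q\theta$, $\rho\kappa < e\theta$, $1-2\kappa > \rho\kappa$, $\lambda p < 2$ and $(1-\lambda)q < e$, with $q$ the conjugate exponent of $p$. This reading of the constraints is an assumption, constraint (C.16) is left out for brevity (adding it can only shrink the feasible set), and the function name is hypothetical.

```python
def constraints_hold(kappa, p, eta, theta, lam, e, rho):
    """Screen a candidate (kappa, p, eta, theta) against (C.3)-(C.13); (C.16) is omitted."""
    if p <= 1.0 or eta <= 0.0 or theta <= 0.0:
        return False
    q = p / (p - 1.0)  # conjugate exponent, 1/p + 1/q = 1
    lhs = 1.0 - 2.0 * kappa
    return (
        lhs > 0.5 + (p / q) * eta - kappa * lam * p      # (C.3)
        and lhs > 0.5 - eta + (1.0 - lam) * q * theta    # (C.4)
        and rho * kappa < e * theta                      # (C.6)
        and lhs > rho * kappa                            # (C.7)
        and lam * p < 2.0                                # (C.8)
        and (1.0 - lam) * q < e                          # (C.13)
    )
```

A crude estimate of an upper bound for $\kappa^*$ could be obtained by evaluating `constraints_hold` on a grid of $(\kappa, p, \eta, \theta)$ and keeping the largest feasible $\kappa$.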

The conditions for posterior consistency without an algebraic rate are more 
straightforward. 



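Corollary 22. Suppose that the truth is in $\mathcal{H}^s$ and that Assumptions 17 and
18 are satisfied. Additionally assume $\int \exp(3f\|a\|_s^e)\,d\mu_0(a) < \infty$ for $f > 0$,
$e > 0$ and $s > 1 + \sigma_0$. If one of the following two conditions holds

$$0 < \lambda < \frac{1}{2} \quad\text{and}\quad e > -1 + 2\sqrt{2}\sqrt{1-\lambda}, \quad\text{or}$$

$$\frac{1}{2} \le \lambda < 1 \quad\text{and}\quad e > 2 - 2\lambda,$$

where $\lambda := \frac{s-1-\sigma_0}{s-1}$, then for any $\epsilon$ there exists a sequence $l_n \uparrow 1$ such that

$$\mathbb{P}\left(\mu^{y_n}\left[B^1_\epsilon(a^\dagger)\right] > l_n\right) \to 1 \quad\text{as } n \to \infty.$$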
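The two conditions of Corollary 22 amount to a lower threshold on the moment exponent $e$ as a function of $\lambda$, reading them as $e > -1 + 2\sqrt{2}\sqrt{1-\lambda}$ for $\lambda < \frac12$ and $e > 2 - 2\lambda$ for $\lambda \ge \frac12$. The following sketch evaluates this threshold; the function name is hypothetical and the treatment of the boundary point $\lambda = \frac12$ is an assumption (both branches give the value $1$ there).

```python
import math


def moment_exponent_threshold(lam):
    """Smallest value that the moment exponent e has to exceed in Corollary 22."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie in (0, 1)")
    if lam < 0.5:
        return -1.0 + 2.0 * math.sqrt(2.0) * math.sqrt(1.0 - lam)
    return 2.0 - 2.0 * lam
```

The threshold stays below $2$ on all of $(0, 1)$, so a prior with exponential moments of order $e = 2$, as a Gaussian prior has by Fernique's Theorem, meets the condition for every admissible $\lambda$.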
Proof. From the proof of Theorem 21 we only have to find $\eta, \theta > 0$ and $p > 1$ such that Inequalities (C.3), (C.4), (C.8), (C.13) and (C.16) are satisfied. Choosing $\eta$ as large as inequality (C.3) permits, that is $\eta := \frac{1}{2(p-1)}$, extends the range of solutions of the other inequalities ((C.4) and (C.16)) containing $\eta$. Similarly, choosing $\theta$ as large as (C.4) permits, that is $\theta := \frac{0.5+\eta}{(1-\lambda)q}$, extends the range of solutions of the inequality (C.16). Letting $\kappa \to 0$ in (C.16) yields

$$p > 1$$

$$\lambda p < 2 \qquad (C.8)$$

$$(1-\lambda)q < e \qquad (C.13)$$

$$\frac{p-2}{2(p-1)}\left(1 + \frac{p-1}{e(p-1) + (\lambda-1)p}\right) < \max\left\{1, \frac{e}{2-2\lambda}\right\}. \qquad (3.14)$$

Now it is left to perform a case-by-case analysis. Starting from inequality (3.14), the first cases are $\frac{e}{2-2\lambda} < 1$ and $\frac{e}{2-2\lambda} > 1$. Inside these cases we have to treat $e(-1+p) + p(-1+\lambda) < 0$ and $e(-1+p) + p(-1+\lambda) > 0$ separately in order to rearrange (3.14) to a quadratic inequality in $p$. The details are tedious but straightforward algebra. □



3.2.3 The Posterior Consistency Rate for Gaussian Priors 

In all results about posterior consistency in this article as well as in the literature
the small ball probability has to decay sufficiently slowly in order to obtain
a rate. Since sufficient asymptotics are rarely available [26], we evaluate the
posterior consistency rate in the Gaussian case by comparing it with the optimal
minimax rates obtained in [24] for the special case of jointly diagonalisable prior
and noise covariance. Even though we obtain rates under weak assumptions, our
rates are close to the optimal rate in this special case.



Posterior Consistency Rates for Gaussian Priors

Suppose the prior is Gaussian, that is $\mu_0 = \mathcal{N}(0, C_0)$, and the eigenvalues $\mu_j$ and $\lambda_j$
of $C_0$ and the covariance operator $\Gamma$ satisfy respectively

$$\mu_j = j^{-2t} \qquad (3.15)$$

$$\lambda_j = j^{-2r} \qquad (3.16)$$

where $e_j$ is an orthonormal basis of eigenvectors for both operators. The inner
product of the Hilbert scale is now explicitly given by

$$\langle u, v\rangle_s = \sum_j j^{2rs} u_j v_j.$$

Moreover, Assumption 18 in Section 3.2.1 is satisfied with $\sigma_0 = \frac{1}{2r}$. Again the
Cameron-Martin norm of $\Gamma$ corresponds to $\|\cdot\|_1$. The covariance operator $C_0$
of $\mu_0$ on $\mathcal{H}^s$ has eigenvalues $\mu_j^{\mathcal{H}^s} = j^{2rs-2t}$. This can be seen by denoting
$S^s e_k = k^{2rs} e_k$ and calculating

$$\mathbb{E}_{\mu_0}\langle x, u\rangle_s\langle x, v\rangle_s = \mathbb{E}_{\mu_0}\langle x, S^s u\rangle\langle x, S^s v\rangle. \qquad (3.17)$$

We need $t > rs + 1$ for $C_0$ to be trace class on $\mathcal{H}^s$. In this case we know by
Example 2 and Proposition 3 in Section 18 that

log (mb\) 



Corollary 23. Let the prior and the observational noise be specified as in {3.15) 



and (3.16). Suppose Assumption 11 with balls in Ti.^ , Assumption 17 and As- 
sumption 18 are satisfied. Additionally we suppose that G "Hj for s < 



If the following optimisation problem has a solution k* > 
Maximize Ki, 

with respect to K*,p > l,ri,9 > 0, s < subject to 



1 p 

1 — 2k > — K ?7 kXp 

2 q 

1-2k > ^-f]+{l-X)qe 
^ -K < 20 



I -2k > K (3.18) 

i — r — 1 

q 2 

Xp < 2 
(l-A)g < 2 

2-'^ (^+ 2-(l'-A), ) < --(l-2.,^.2) 

where X := ^^TzFp^- Then for any k < k* the posterior /i^" is consistent in 
with rate n^'^ . 

Comparison to Optimal Rate 

In order to compare our posterior consistency rates with [24] we cast our problem
in their setting and notation. Let $\zeta$ be $H$-valued white noise; our problem
corresponds to recovering $a$ from

$$y = a + \frac{1}{\sqrt{n}}\Gamma^{1/2}\zeta.$$

This is equivalent to the problem that

$$Y = Ka + \frac{1}{\sqrt{n}}\zeta \qquad (3.19)$$

where $K = \Gamma^{-1/2}$. Let $\{f_k\}$ be an orthonormal basis of eigenvectors of $\Gamma$ on $H$. In
order to adapt to the notation of [24], we call $H_2 := H$. $H_1$ will be equivalent
to the Cameron-Martin space. In their notation it is equivalent to take

$$H_1 := \Big\{v \in H_2 \,\Big|\, v = \sum_k v_k f_k \ \text{ s.t. } \sum_k k^{2r} v_k^2 < \infty\Big\}$$

with orthonormal basis $e_k = f_k / k^r$. We consider $K : H_1 \to H_2$

$$K e_k := \Gamma^{-1/2} e_k = f_k.$$

In order to match Assumption 3.1 in [24], we have to bound the eigenvalues $\kappa_i^2$
of $K^T K$ like

$$C^{-1} i^{-p} \le \kappa_i \le C i^{-p}.$$

We determine these eigenvalues by noting that

$$\langle K^T f_k, e_j\rangle_{H_1} = \langle f_k, K e_j\rangle_{H_2} = \delta_{jk}.$$

The calculation above yields

$$K^T K e_k = e_k, \qquad \kappa_k^2 = 1 \implies p = 0.$$



We identify the covariance operator of $\mu_0$ on $H_1$ and its eigenvalues as in (3.17).
By Theorem 4.1 of [21] the posterior contraction rate is

$$n^{-\frac{\alpha \wedge \beta}{1 + 2\alpha + 2p}}$$

where $-1 - 2\alpha = -2t + 2r$ and $\beta$ is the regularity of the truth. Supposing that
$\beta > \alpha$ results in

$$\kappa_{\mathrm{opt}} = \frac{\alpha}{1 + 2\alpha} = \frac{2(t-r) - 1}{4(t-r)}.$$
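The contraction exponent $\frac{\alpha \wedge \beta}{1 + 2\alpha + 2p}$ with $1 + 2\alpha = 2t - 2r$ is simple to evaluate. The following sketch does so for given $t$, $r$, $\beta$ and $p$; the function name and the default $p = 0$ (the value identified in the calculation above) are illustrative assumptions.

```python
def contraction_exponent(t, r, beta, p=0.0):
    """Exponent (alpha ^ beta) / (1 + 2 alpha + 2 p) with alpha = t - r - 1/2."""
    alpha = t - r - 0.5  # from 1 + 2 * alpha = 2 * t - 2 * r
    if alpha <= 0.0:
        raise ValueError("requires t > r + 1/2")
    return min(alpha, beta) / (1.0 + 2.0 * alpha + 2.0 * p)
```

When $\beta > \alpha$ and $p = 0$ the value reduces to $\frac{\alpha}{1+2\alpha}$, which increases towards $\frac12$ as $t - r$ grows.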

In Figure 3.1 we use numerical optimisation to compare our rate to the optimal
one for $r = 1$ and varying $t$. Just considering Inequality (3.18) yields

^h°P°= 2('t-''r')-l 



which is a better rate than obtainable by our method (see Figure 3.1). Thus
even if we are able to improve our bounds, there is a genuine gap between our
rate and the optimal rate. The reason for this gap is the use of infima and
suprema over balls in the proof of Theorem 21.

Figure 3.1: Rate comparison in the Gaussian case



3.3 Convergence in Stronger Norms 

We show how the application of an interpolation inequality can strengthen the
norm in which the posterior concentrates. We consider here the nonparametric
Bayesian regression of functional responses with a Gaussian random field as
noise. The setting of pointwise observations in the large data limit can be dealt
with similarly, for example using interpolation between the Hölder spaces.

Suppose we know that the posterior concentrates in the Cameron-Martin 
norm ||-||-, . In order to show consistency in ||-||^, we write 



1 fll tll^ll tl|i-^ 1 

> ej C ||a — a' II ||a — a' ^ f 

c {ll«-«1^>;J}u{||«-«ir'>^}- 

The posterior of the first set is small due to the posterior consistency in $\mathcal{H}^1$. The
posterior of the second set is small due to tails of the prior and the posterior.
Obtaining estimates of this type can be done similarly to the steps subsequent
to (C.12) in the proof of Theorem 21.



4 Conclusions 

We have established a novel link between stability results for inverse problems
and posterior consistency for the Bayesian approach to them. We have explicitly
established this link for the elliptic inverse problem but the same method is
generally applicable. An instance is Electrical Impedance Tomography (Calderón
problem) where stability results are available [2]. This example would lead to a
very slow convergence rate since the stability result is weak.

The elliptic inverse problem has also been considered with log-Gaussian priors.
Log-Gaussian measures have moments of arbitrary order but no exponential
moments. However, we need exponential moments for the Bayesian
regression of functional response and for pointwise observations (see also Section
4.2.2 in [...]). For this reason it is harder to show posterior consistency in that
case. This is a problem we would like to pursue in the future.



A Notation and Review of Technical Tools - Gaussian Measures and Hilbert Scales

A.1 Gaussian Measures

In this section we set out our notation for some standard results about infinite 
dimensional Gaussians which can be found in |31 Ul [T3] . Let 7 be a Gaussian 
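measure on a Hilbert space $(H, \langle\cdot,\cdot\rangle)$, denoted by $\gamma$ in the formulas below. It is characterised by its mean given by
the Bochner integral

$$m = \int_H x\, d\gamma(x)$$

and the covariance operator $C : H \to H$ characterised by the relation

$$\langle Cu, v\rangle = \int_H \langle u, x - m\rangle\langle v, x - m\rangle\, d\gamma(x).$$

From this it is clear that the covariance operator is positive definite and
self-adjoint. Moreover, we note that $C$ is trace-class and a draw from a Gaussian
can be expressed through the eigenvalues $\lambda_k$ and the corresponding eigenbasis $\phi_k$

$$\gamma = \mathcal{L}\Big(m + \sum_{k=1}^{\infty} \sqrt{\lambda_k}\,\xi_k\phi_k\Big) \quad\text{with } \xi_k \sim \mathcal{N}(0, 1) \text{ i.i.d.}$$

The Cameron-Martin space associated with $\gamma$ is

$$\Big\{x \,\Big|\, x = \sum_i x_i\phi_i \ \text{ s.t. } \sum_i \frac{x_i^2}{\lambda_i} < \infty\Big\}$$

equipped with the inner product $\langle x, y\rangle = \sum_i \frac{x_i y_i}{\lambda_i}$ for $x = \sum_i x_i\phi_i$ and $y = \sum_i y_i\phi_i$.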

A.2 Hilbert Scales

In order to measure the smoothness of the noise and draws of the prior, we 
introduce Hilbert Scales [TD]. Let F be a self-adjoint, positive-definite, trace 
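class linear operator with eigensystem $(\lambda_k, \phi_k)$, written $\Gamma$ in the formulas below. We know that $\Gamma^{-1}$ is a densely
defined, unbounded, symmetric and positive definite operator because

$$H = \mathrm{Ker}(\Gamma) \oplus \mathrm{Ker}(\Gamma)^{\perp} = \overline{\mathcal{R}(\Gamma)}.$$

We define the Hilbert scale by $(\mathcal{H}^s, \langle\cdot,\cdot\rangle_s)_{s \in \mathbb{R}}$ with $\mathcal{H}^s := \overline{M}^{\|\cdot\|_s}$ for

$$M := \bigcap_{n=0}^{\infty} \mathcal{D}(\Gamma^{-n}), \qquad \langle u, v\rangle_s := \langle \Gamma^{-s/2}u, \Gamma^{-s/2}v\rangle, \qquad \|u\|_s := \|\Gamma^{-s/2}u\|.$$

Furthermore, we will denote balls with respect to the $\|\cdot\|_s$-norm by

$$B^s_R(u) = \{x \mid \|u - x\|_s < R\}.$$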

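B Change of Variables for Bayesian Inverse Problems

The state of a model can be described in several ways. In this section we present
the resulting relationship between two different descriptions of the same model:

Theorem. Suppose $\mathcal{G} = \mathcal{O} \circ G$ with $G : (X, \|\cdot\|_X) \to (Y, \|\cdot\|_Y)$ and $\mathcal{O} :
(Y, \|\cdot\|_Y) \to (Z, \|\cdot\|_Z)$. Furthermore, assume that the posterior $\mu^{y}$ ($\tilde\mu^{y}$) is well-defined
for the forward operator $\mathcal{G}$ ($\mathcal{O}$), the prior $\mu_0(da)$ ($\tilde\mu_0(dp)$) and the noise
$\xi \sim \mathcal{N}(0, \Gamma)$. It is given by

$$\frac{d\mu^{y}}{d\mu_0}(a) \propto \exp\left(-\frac{1}{2}\|\mathcal{G}(a)\|_\Gamma^2 + \langle y, \mathcal{G}(a)\rangle_\Gamma\right)$$

$$\frac{d\tilde\mu^{y}}{d\tilde\mu_0}(p) \propto \exp\left(-\frac{1}{2}\|\mathcal{O}(p)\|_\Gamma^2 + \langle y, \mathcal{O}(p)\rangle_\Gamma\right)$$

In this case $G_*\mu^{y} = \tilde\mu^{y}$.

Proof. It is sufficient to show that both measures agree on all sets $A \in \mathcal{B}(Y)$

$$(G_*\mu^{y})(A) = \int_A dG_*\mu^{y} = \int_{G^{-1}(A)} d\mu^{y}(a).$$

By the transformation rule

$$\int_{G^{-1}(A)} c \cdot \exp\left(-\frac{1}{2}\|\mathcal{O}(G(a)) - y\|_\Gamma^2\right) d\mu_0(a) = \int_A c \cdot \exp\left(-\frac{1}{2}\|\mathcal{O}(p) - y\|_\Gamma^2\right) dG_*\mu_0(p).$$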

□ 
C Proof of Theorem 21

Proof of Theorem 21. We follow the same steps as in the proof of Theorem 20
up to Equation (3.9) reading

$$|\langle a-a^\dagger, \xi\rangle_1| \le K_n\|a-a^\dagger\|_1^{\lambda}\,\|a-a^\dagger\|_s^{1-\lambda} \qquad (3.9)$$

with $\lambda = \frac{s-1-\sigma_0-\gamma}{s-1}$. We separate the product using Young's inequality
with $\frac{1}{p} + \frac{1}{q} = 1$



< Kn\\a ~ a^\\-^^ + n ''(;^ ~ ct^Hs) 



til 



(C.l) 



where, for simplicity of notation, we used 
Lower bound on /^^" (B^„-k (a^)): 



P 



The following lower bound on /^^" (^g„-K (a^)) is based on Equation (C.l I 
^^"(i?i„_.(at)) 



> 



exp 



The term $n^{1-2\kappa}$ has to be dominant in Equation (C.2) because the same exponent
is appearing in Equation (C.10) except for a larger coefficient. Choosing
$R = n^\theta$ and substituting the expression for $K_n$, this is the case if

$$1 - 2\kappa > \frac{1}{2} + \frac{p}{q}\eta - \kappa\lambda p \qquad (C.3)$$

$$1 - 2\kappa > \frac{1}{2} - \eta + (1-\lambda)q\theta \qquad (C.4)$$

(C.5)


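We need small ball probabilities (see Assumption 11) and the exponential moments
of $\mu_0$ to obtain explicit sufficient conditions on $\kappa$. We first note that
Equation (C.5) holds if

$$\rho\kappa < e\theta \qquad (C.6)$$

$$1 - 2\kappa > \rho\kappa. \qquad (C.7)$$

Upper bound on $\mu^{y_n}(B^1_{\epsilon n^{-\kappa}}(a^\dagger)^c)$:
We bound $\mu^{y_n}(B^1_{\epsilon n^{-\kappa}}(a^\dagger)^c)$ by

$$\mu^{y_n}\left(B^1_{\epsilon n^{-\kappa}}(a^\dagger)^c\right) \le \mu^{y_n}\left(B^1_{\epsilon n^{-\kappa}}(a^\dagger)^c \cap B^s_R(0)\right) + \mu^{y_n}\left(B^1_{\epsilon n^{-\kappa}}(a^\dagger)^c \cap B^s_R(0)^c\right).$$

Upper bound on $\mu^{y_n}(B^1_{\epsilon n^{-\kappa}}(a^\dagger)^c \cap B^s_R(0))$:

We denote by $M_{B^1_{\epsilon n^{-\kappa}}(a^\dagger)^c \cap B^s_R(0)}$ the following supremum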

sup lla - o'l'll^ + Vn^n ||a - a"f||^'' + n5"''' ( ^ lla - a^'l ) (i"^)? 

Bi^„^(at)<=ns^(o) 2 \2 "J 

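which is finite if

$$\lambda p < 2. \qquad (C.8)$$

The first two summands above can be rewritten as $f(\|a-a^\dagger\|_1)$ where

$$f(d) = -\frac{n}{2}d^2 + \sqrt{n}K_n d^{\lambda p}.$$

By considering $f'$, we see that $f$ is decreasing for $d > (K_n\lambda p\, n^{-1/2})^{\frac{1}{2-\lambda p}}$. Thus for

$$\epsilon n^{-\kappa} > (K_n\lambda p\, n^{-1/2})^{\frac{1}{2-\lambda p}} \qquad (C.9)$$

the following inequality holds

$$\mu^{y_n}\left(B^1_{\epsilon n^{-\kappa}}(a^\dagger)^c \cap B^s_R(0)\right) \le C(n,\xi)\exp\Big(-\frac{\epsilon^2}{2}n^{1-2\kappa} + n^{\frac{1}{2}-\lambda p\kappa}K_n\epsilon^{\lambda p} + n^{\frac{1}{2}-\eta}\big(R^{(1-\lambda)q} + \|a^\dagger\|_s^{(1-\lambda)q}\big)\Big). \qquad (C.10)$$

For large $n$ Equation (C.9) is implied by

$$\frac{p}{q}\eta - \kappa\lambda p < \frac{1}{2} - 2\kappa. \qquad (C.11)$$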

Upper bound on $\mu^{y_n}(B^1_{\epsilon n^{-\kappa}}(a^\dagger)^c \cap B^s_R(0)^c)$:

In this section we bound $\mu^{y_n}(B^1_{\epsilon n^{-\kappa}}(a^\dagger)^c \cap B^s_R(0)^c)$ using Markov's inequality
in combination with the exponential moments of the prior



M^"(exp(/|Mi:)xBi^_(,t)c)< J C(n,0exp(n5-'^||a|l,(i-^)') 

exp(-| \\a - a^ll + V^Kn \\a - a^^/ + n^"" \\a^\[''^^'')d^loia). (C.12) 

We denote the term appearing in the exponential in the second line by $T_0$.
It can be bounded similarly to the upper bound on $\mu^{y_n}(B^1_{\epsilon n^{-\kappa}}(a^\dagger)^c \cap B^s_R(0))$:

$$T_0 \le UT_0 := -\frac{\epsilon^2}{2}n^{1-2\kappa} + n^{\frac{1}{2}-\lambda p\kappa}K_n\epsilon^{\lambda p} + n^{\frac{1}{2}-\eta}\|a^\dagger\|_s^{(1-\lambda)q}.$$

We denote by $\lesssim$ an inequality with a multiplicative constant not involving $n$ or
$\kappa$. In order to get an upper bound for (C.12) we bound the exponential moment



/x^"(exp(/ |M!:)xsi^_j,t)e) =^ C(n,0 / exp (n^-'J ||a||i^-^)« + / +ilTo) d^l^ 



Introducing

$$g(r) = n^{\frac{1}{2}-\eta}r^{(1-\lambda)q} + f r^{e}$$

$$g'(r) = n^{\frac{1}{2}-\eta}(1-\lambda)q\, r^{(1-\lambda)q-1} + e f r^{e-1}$$

and performing an integration by parts, we get

M''"(exp(/|MpXsi(at)0 



CM 



^ / exp(.g(|la|lj +ilTo)dMo(a) 



^ exp(ilT,J 



g'{r) exp{g{r))dr 



4 / g'{r)exp{g{r) +i^To)d^J.o{\\a\\s > r)dr 
Jo 

4 g'iR) exp (n^^V^i-^)" - 2/r'=) dr. 

The above can only be expected to be finite if

$$(1-\lambda)q < e. \qquad (C.13)$$

Moreover, we will assume that $\eta < \frac{1}{2}$ since otherwise

$$\int \exp\left(n^{\frac{1}{2}-\eta}\|a\|_s^{(1-\lambda)q} + f\|a\|_s^{e}\right) d\mu_0(a) \lesssim \int \exp\left(2f\|a\|_s^{e}\right) d\mu_0(a).$$

In order to achieve an upper bound, we split the term in the exponential into
$T_1 = n^{\frac{1}{2}-\eta}r^{(1-\lambda)q} - f r^{e}$ and $T_2 = -f r^{e}$. The first term is negative whenever

$$r > r_* := \left(\frac{n^{\frac{1}{2}-\eta}}{f}\right)^{\frac{1}{e-(1-\lambda)q}}.$$

For $n$ large enough $r_* > 1$ holds. On the interval $[0, r_*]$ an upper bound on
the maximum value of $T_1$ can be derived as follows:

(C.14)

Putting everything together gives rise to
/xf''(exp(/|HQXBi_j„t)c) 



f 

J r,. 



CM 

[ \ni-'>{l - A)gr(i-^)9-^ +e/r^-^)exp(ilTi +iiTo)dr 
Jo 

(n5-''(l - A)gr(^-^)«-i + e/r"-i) exp (-/r'^ + Uto) dr 
4 n"exp(itTi +ilTo) 
for some $a$. Using Markov's inequality, this yields

M^" (B^a^rnB^Oy) 4 C(n, On'"" exp (il^o ' fR') (C.15) 

Again substituting $R = n^\theta$, this is asymptotically smaller than $\exp\left(-\frac{\epsilon^2}{4}n^{1-2\kappa}\right)$ if

$$\left(\frac{1}{2} - \eta\right)\left(1 + \frac{1}{e-(1-\lambda)q}\right) < \max\{1 - 2\kappa, \theta e\}. \qquad (C.16)$$

Collecting the inequalities from above, we see that the results follow by
letting $\gamma \to 0$.

□